Abstract

We consider Gaussian multiple-input multiple-output (MIMO) channels with discrete input alphabets. We propose
a non-diagonal precoder based on the X-Codes in [1] to increase the mutual information. The MIMO channel is
transformed into a set of parallel subchannels using singular value decomposition (SVD), and X-Codes are then used
to pair the subchannels. X-Codes are fully characterized by the pairings and a $2 \times 2$ real rotation matrix for each pair
(parameterized with a single angle). This precoding structure enables us to express the total mutual information as a
sum of the mutual information of all the pairs. The problem of finding the optimal precoder with the above structure,
which maximizes the total mutual information, is solved by i) optimizing the rotation angle and the power allocation
within each pair and ii) finding the optimal pairing and power allocation among the pairs. It is shown that the mutual
information achieved with the proposed pairing scheme is very close to that achieved with the optimal precoder
by Cruz et al., and is significantly better than the Mercury/waterfilling strategy by Lozano et al. Our approach greatly
simplifies both the precoder optimization and the detection complexity, making it suitable for practical applications.

Index Terms 

Mutual information, MIMO, OFDM, precoding, singular value decomposition, condition number. 

I. Introduction 

Many modern communication channels are modeled as a Gaussian multiple-input multiple-output (MIMO) 
channel. Examples include multi-tone digital subscriber line (DSL), orthogonal frequency division multiplexing 
(OFDM) and multiple transmit-receive antenna systems. It is known that the capacity of the Gaussian MIMO 
channel is achieved by beamforming a Gaussian input alphabet along the right singular vectors of the MIMO 
channel. The received vector is projected along the left singular vectors, resulting in a set of parallel Gaussian 
subchannels. Optimal power allocation between the subchannels is achieved by waterfilling [2]. In practice, the
input alphabet is not Gaussian and is generally chosen from a finite signal set.

We distinguish between two kinds of MIMO channels: i) diagonal (or parallel) channels and ii) non-diagonal
channels.

For a diagonal MIMO channel with discrete input alphabets, assuming only power allocation on each subchannel
(i.e., a diagonal precoder), Mercury/waterfilling was shown to be optimal by Lozano et al. in [3]. With discrete
input alphabets, Cruz et al. later proved in [4] that the optimal precoder is, however, non-diagonal, i.e., precoding
needs to be performed across all the subchannels.

For a general non-diagonal Gaussian MIMO channel, it was also shown in [4] that the optimal precoder is
non-diagonal. Such an optimal precoder is given by a fixed point equation, which requires a high complexity numeric
evaluation. Since the precoder jointly codes all the $n$ inputs, joint decoding is also required at the receiver. Thus,
the decoding complexity can be very high, especially for large $n$, as in the case of DSL and OFDM applications.
This motivates our quest for a practical low complexity precoding scheme that achieves near-optimal capacity.

In this paper, we consider a general MIMO channel and a non-diagonal precoder based on X-Codes [1]. The
MIMO channel is transformed into a set of parallel subchannels using singular value decomposition (SVD), and
X-Codes are then used to pair the subchannels. X-Codes are fully characterized by the pairings and the 2-dimensional
real rotation matrices for each pair. These rotation matrices are parameterized with a single angle. This precoding
structure enables us to express the total mutual information as a sum of the mutual information of all the pairs.

The problem of finding the optimal precoder with the above structure, which maximizes the total mutual
information, can be split into two tractable problems: i) optimizing the rotation angle and the power allocation
within each pair and ii) finding the optimal pairing and power allocation among the pairs. It is shown by simulation
that the mutual information achieved with the proposed pairing scheme is very close to that achieved with the
optimal precoder in [4], and is significantly better than the Mercury/waterfilling strategy in [3]. Our approach
greatly simplifies both the precoder optimization and the detection complexity, making it suitable for practical
applications.

The rest of the paper is organized as follows. Section II introduces the system model and SVD precoding.
In Section III we provide a brief review of the optimal precoding with discrete inputs in [4] and the relevant
MIMO capacity. In Section IV we present the precoding using X-Codes with discrete inputs and the relevant
capacity expressions. In Section V we consider the first problem, which is to find the optimal rotation angle and
power allocation for a given pair. This problem is equivalent to optimizing the mutual information for a Gaussian
MIMO channel with two subchannels. In Section VI, using the results from Section V, we attempt to optimize the
mutual information for a Gaussian MIMO channel with $n$ subchannels, where $n > 2$. In Section VII we discuss the
application of our precoding to OFDM systems. Conclusions are drawn in Section VIII.

Notations: The field of complex numbers is denoted by $\mathbb{C}$, and $\mathbb{R}^+$ denotes the positive real numbers. Superscripts
$T$ and $\dagger$ denote transposition and Hermitian transposition, respectively. The $n \times n$ identity matrix is denoted by $\mathbf{I}_n$,
and the zero matrix is denoted by $\mathbf{0}$. $E[\cdot]$ is the expectation operator, $\|\cdot\|$ denotes the Euclidean norm of a
vector, and $\|\cdot\|_F$ the Frobenius norm of a matrix. Finally, we let $\mathrm{tr}(\cdot)$ be the trace of a matrix.

II. System model and Precoding with Gaussian inputs

We consider an $n_t \times n_r$ MIMO channel, where the channel state information (CSI) is known perfectly at both
transmitter and receiver. Let $\mathbf{x} = (x_1, \ldots, x_{n_t})^T$ be the vector of input symbols to the channel, and let $\mathbf{H} = \{h_{ij}\}$,
$i = 1, \ldots, n_r$, $j = 1, \ldots, n_t$, be a full rank $n_r \times n_t$ channel coefficient matrix, with $h_{ij}$ representing the complex
channel gain between the $j$-th input symbol and the $i$-th output symbol. The vector of $n_r$ channel output symbols
is given by

$$\mathbf{y} = \sqrt{P_T}\,\mathbf{H}\mathbf{x} + \mathbf{w} \qquad (1)$$

where $\mathbf{w}$ is an uncorrelated Gaussian noise vector, such that $E[\mathbf{w}\mathbf{w}^\dagger] = \mathbf{I}_{n_r}$, and $P_T$ is the total transmitted power.
The power constraint is given by

$$E[\|\mathbf{x}\|^2] = 1 \qquad (2)$$

The maximum multiplexing gain of this channel is $n = \min(n_r, n_t)$. Let $\mathbf{u} = (u_1, \ldots, u_n)^T \in \mathbb{C}^n$ be the vector
of $n$ information symbols to be sent through the MIMO channel, with $E[|u_i|^2] = 1$, $i = 1, \ldots, n$. Then the vector
$\mathbf{u}$ can be precoded using an $n_t \times n$ matrix $\mathbf{T}$, resulting in $\mathbf{x} = \mathbf{T}\mathbf{u}$.

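The capacity of the deterministic Gaussian MIMO channel is then achieved by solving

Problem 1:

$$\begin{aligned} C(\mathbf{H}, P_T) &= \max_{\mathbf{K}_x \,|\, \mathrm{tr}(\mathbf{K}_x) = 1} I(\mathbf{x}; \mathbf{y} | \mathbf{H}) \\ &\ge \max_{\mathbf{K}_u, \mathbf{T} \,|\, \mathrm{tr}(\mathbf{T}\mathbf{K}_u\mathbf{T}^\dagger) = 1} I(\mathbf{u}; \mathbf{y} | \mathbf{H}) \end{aligned} \qquad (3)$$

where $I(\mathbf{x}; \mathbf{y} | \mathbf{H})$ is the mutual information between $\mathbf{x}$ and $\mathbf{y}$, and $\mathbf{K}_x = E[\mathbf{x}\mathbf{x}^\dagger]$, $\mathbf{K}_u = E[\mathbf{u}\mathbf{u}^\dagger]$ are the covariance
matrices of $\mathbf{x}$ and $\mathbf{u}$ respectively. The inequality in (3) follows from the data processing inequality [2].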

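Let us consider the singular value decomposition (SVD) of the channel $\mathbf{H} = \mathbf{U}\mathbf{\Lambda}\mathbf{V}$, where $\mathbf{U} \in \mathbb{C}^{n_r \times n}$,
$\mathbf{\Lambda} \in \mathbb{C}^{n \times n}$, $\mathbf{V} \in \mathbb{C}^{n \times n_t}$, $\mathbf{U}^\dagger\mathbf{U} = \mathbf{V}\mathbf{V}^\dagger = \mathbf{I}_n$, and $\mathbf{\Lambda} = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$.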

Telatar showed in [5] that the Gaussian MIMO capacity $C(\mathbf{H}, P_T)$ is achieved when $\mathbf{x}$ is Gaussian distributed
and $\mathbf{V}\mathbf{K}_x\mathbf{V}^\dagger$ is diagonal. This can be achieved by using the optimal precoder matrix $\mathbf{T} = \mathbf{V}^\dagger\mathbf{P}$,
where $\mathbf{P} \in (\mathbb{R}^+)^n$ is the diagonal power allocation matrix such that $\mathrm{tr}(\mathbf{P}\mathbf{P}^\dagger) = 1$. Furthermore, $u_i$, $i = 1, \ldots, n$,
are i.i.d. Gaussian (i.e., no coding is required across the input symbols $u_i$). With this, the second line of (3) is
actually an equality. Also, projecting the received vector $\mathbf{y}$ along the columns of $\mathbf{U}$ is information lossless and
transforms the non-diagonal MIMO channel into an equivalent diagonal channel with $n$ non-interfering subchannels.
The equivalent diagonal system model is then given by

$$\mathbf{r} = \mathbf{U}^\dagger\mathbf{y} = \sqrt{P_T}\,\mathbf{\Lambda}\mathbf{P}\mathbf{u} + \tilde{\mathbf{w}} \qquad (4)$$

where $\tilde{\mathbf{w}}$ is the equivalent noise vector, and has the same statistics as $\mathbf{w}$. The total mutual information is now given
by

$$I(\mathbf{x}; \mathbf{y} | \mathbf{H}) = \sum_{i=1}^{n} \log_2\left(1 + \lambda_i^2 p_i^2 P_T\right) \qquad (5)$$

Note that now the mutual information is a function of only the power allocation matrix $\mathbf{P} = \mathrm{diag}(p_1, \ldots, p_n)$,
with the constraint $\mathrm{tr}(\mathbf{P}\mathbf{P}^\dagger) = 1$. Optimal power allocation is achieved through waterfilling between the $n$ parallel
channels of the equivalent system in (4) [2].
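As an illustration of the waterfilling step for Gaussian inputs, the following is a minimal sketch that maximizes the sum in (5) under the constraint $\sum_i p_i^2 = 1$. The function name `waterfill` and the variable names are illustrative assumptions, not taken from the paper, and all gains are assumed strictly positive (full rank channel).

```python
def waterfill(gains, p_total):
    """Return q_i = p_i^2 (summing to 1) that maximize
    sum_i log2(1 + gains[i]^2 * q_i * p_total).

    Assumes every gain is strictly positive (hypothetical helper)."""
    # "Floor height" of each subchannel: inverse of its effective SNR.
    floors = sorted((1.0 / (g * g * p_total), i) for i, g in enumerate(gains))
    q = [0.0] * len(gains)
    # Try to keep the m strongest subchannels active, m = n, n-1, ..., 1.
    for m in range(len(floors), 0, -1):
        # Water level that spends exactly unit power on the m best subchannels.
        level = (1.0 + sum(f for f, _ in floors[:m])) / m
        if level > floors[m - 1][0]:  # weakest active subchannel gets power > 0
            for f, i in floors[:m]:
                q[i] = level - f
            break
    return q
```

At low $P_T$ the sketch assigns all power to the strongest subchannel, and at high $P_T$ the allocation approaches a uniform split, which is the behaviour the text later contrasts with discrete-input schemes.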



III. Optimal precoding with discrete inputs 

In practice, discrete input alphabets are used. Subsequently, we assume that the $i$-th information symbol is given
by $u_i \in \mathcal{U}_i$, where $\mathcal{U}_i \subset \mathbb{C}$ is a finite signal set. Let $S = \mathcal{U}_1 \times \mathcal{U}_2 \times \cdots \times \mathcal{U}_n$ be the overall input alphabet. The
capacity of the Gaussian MIMO channel with discrete input alphabet $S$ is defined by the following problem

Problem 2:

$$C_S(\mathbf{H}, P_T) = \max_{\mathbf{T} \,|\, \mathbf{u} \in S,\ \|\mathbf{T}\|_F = 1} I(\mathbf{u}; \mathbf{y} | \mathbf{H}) \qquad (6)$$

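Note that there is no maximization over the pdf of $\mathbf{u}$, since we fix $\mathbf{K}_u = \mathbf{I}_n$. The optimal precoder $\mathbf{T}^*$, which
solves Problem 2, is given by the following fixed point equation given in [4]

$$\mathbf{T}^* = \frac{\mathbf{H}^\dagger\mathbf{H}\mathbf{T}^*\mathbf{E}}{\|\mathbf{H}^\dagger\mathbf{H}\mathbf{T}^*\mathbf{E}\|_F} \qquad (7)$$

where $\mathbf{E}$ is the minimum mean-square error (MMSE) matrix of $\mathbf{u}$ given by

$$\mathbf{E} = E\left[(\mathbf{u} - E[\mathbf{u}|\mathbf{y}])(\mathbf{u} - E[\mathbf{u}|\mathbf{y}])^\dagger\right] \qquad (8)$$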

The optimal precoder is derived using the relation between MMSE and mutual information [6]. We observe that,
with discrete input alphabets, it is no longer optimal to beamform along the column vectors of $\mathbf{V}^\dagger$ and then use
waterfilling on the parallel subchannels. Even when $\mathbf{H}$ is diagonal (parallel non-interfering subchannels), the optimal
precoder $\mathbf{T}^*$ is non-diagonal, and can be computed numerically (using a gradient-based method) as discussed in
[4]. However, the complexity of computing $\mathbf{T}^*$ is prohibitively high for practical applications, especially when $n$
is large and/or the channel changes frequently.

We propose a suboptimal precoding scheme based on X-Codes [1], which achieves close to the optimal capacity
$C_S(\mathbf{H}, P_T)$, at low encoding and decoding complexities.



IV. Precoding with X-Codes 

X-Codes are based on a pairing of $n$ subchannels $\ell = \{(i_k, j_k) \in [1, n] \times [1, n],\ i_k < j_k,\ k = 1, \ldots, n/2\}$. For a
given $n$, there are $(n-1)(n-3) \cdots 3 \cdot 1$ possible pairings. Let $\mathcal{L}$ denote the set of all possible pairings. For example,
with $n = 4$, we have

$$\mathcal{L} = \{\{(1,4), (2,3)\}, \{(1,2), (3,4)\}, \{(1,3), (2,4)\}\}$$
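The pairing count above can be checked with a short enumeration. The following is a minimal sketch; the generator name `all_pairings` is an illustrative assumption and is not taken from the paper.

```python
def all_pairings(indices):
    """Yield every way to split `indices` (a list of even length) into pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    # Pair the first index with each remaining index in turn, then recurse.
    for pos, partner in enumerate(rest):
        remaining = rest[:pos] + rest[pos + 1:]
        for tail in all_pairings(remaining):
            yield [(first, partner)] + tail


pairings = list(all_pairings([1, 2, 3, 4]))
# -> [[(1, 2), (3, 4)], [(1, 3), (2, 4)], [(1, 4), (2, 3)]]
```

The first index has $n-1$ possible partners, the next unpaired index has $n-3$, and so on, which gives the product $(n-1)(n-3)\cdots 3 \cdot 1$ stated in the text. This growth is why exhaustive enumeration is only practical for small $n$.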

X-Codes are generated by an $n \times n$ real orthogonal matrix, denoted by $\mathbf{G}$. When precoding with X-Codes, the
precoder matrix is given by $\mathbf{T} = \mathbf{V}^\dagger\mathbf{P}\mathbf{G}$, where $\mathbf{P} = \mathrm{diag}(p_1, p_2, \ldots, p_n) \in (\mathbb{R}^+)^n$ is the diagonal power allocation
matrix such that $\mathrm{tr}(\mathbf{P}\mathbf{P}^\dagger) = 1$. The $k$-th pair consists of subchannels $i_k$ and $j_k$. For the $k$-th pair, the information
symbols $u_{i_k}$ and $u_{j_k}$ are jointly coded using a $2 \times 2$ real orthogonal matrix given by

$$\mathbf{A}_k = \begin{bmatrix} \cos\theta_k & \sin\theta_k \\ -\sin\theta_k & \cos\theta_k \end{bmatrix}, \quad k = 1, \ldots, n/2 \qquad (9)$$

The angle $\theta_k$ can be chosen to maximize the mutual information for the $k$-th pair. Each $\mathbf{A}_k$ is a submatrix of the
code matrix $\mathbf{G} = (g_{i,j})$ as shown below

$$g_{i_k,i_k} = \cos\theta_k, \quad g_{i_k,j_k} = \sin\theta_k, \quad g_{j_k,i_k} = -\sin\theta_k, \quad g_{j_k,j_k} = \cos\theta_k$$

It was shown in [1] that, for achieving the best diversity gain, an optimal pairing is one in which the $k$-th subchannel
is paired with the $(n-k+1)$-th subchannel. For example, with this pairing and $n = 6$, the X-Code generator matrix
is given by

$$\mathbf{G} = \begin{bmatrix}
\cos\theta_1 & 0 & 0 & 0 & 0 & \sin\theta_1 \\
0 & \cos\theta_2 & 0 & 0 & \sin\theta_2 & 0 \\
0 & 0 & \cos\theta_3 & \sin\theta_3 & 0 & 0 \\
0 & 0 & -\sin\theta_3 & \cos\theta_3 & 0 & 0 \\
0 & -\sin\theta_2 & 0 & 0 & \cos\theta_2 & 0 \\
-\sin\theta_1 & 0 & 0 & 0 & 0 & \cos\theta_1
\end{bmatrix}$$

The special case with $\theta_k = 0$, $k = 1, 2, \ldots, n/2$, results in no coding across subchannels.
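To make the structure of the generator matrix concrete, here is a minimal sketch that assembles $\mathbf{G}$ from a pairing and a list of angles by placing each $2 \times 2$ rotation at rows and columns $(i_k, j_k)$. The function name `xcode_generator` is an illustrative assumption; subchannel indices are 1-based as in the text.

```python
import math


def xcode_generator(n, pairing, angles):
    """Build the n x n X-Code generator matrix as a list of rows.

    `pairing` is a list of 1-based index pairs (i_k, j_k) and `angles`
    holds one rotation angle theta_k (in radians) per pair."""
    g = [[0.0] * n for _ in range(n)]
    for (i, j), theta in zip(pairing, angles):
        c, s = math.cos(theta), math.sin(theta)
        g[i - 1][i - 1] = c   # g_{i_k, i_k}
        g[i - 1][j - 1] = s   # g_{i_k, j_k}
        g[j - 1][i - 1] = -s  # g_{j_k, i_k}
        g[j - 1][j - 1] = c   # g_{j_k, j_k}
    return g


# Pairing of the k-th with the (n-k+1)-th subchannel for n = 6.
G = xcode_generator(6, [(1, 6), (2, 5), (3, 4)], [0.3, 0.5, 0.7])
```

Because each index appears in exactly one pair, the rows of the resulting matrix are orthonormal, and setting every angle to zero yields the identity matrix (no coding across subchannels).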

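Given the generator matrix $\mathbf{G}$, the subchannel gains $\mathbf{\Lambda}$, and the power allocation matrix $\mathbf{P}$, the mutual information
between $\mathbf{u}$ and $\mathbf{y}$ is given by

$$\begin{aligned} I_S(\mathbf{u}; \mathbf{y} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) &= h(\mathbf{y} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) - h(\mathbf{w}) \\ &= -\int p(\mathbf{y} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) \log_2 p(\mathbf{y} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G})\, d\mathbf{y} - n_r \log_2(\pi e) \end{aligned} \qquad (11)$$

where the received vector pdf is given by

$$p(\mathbf{y} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) = \frac{1}{|S|\,\pi^{n_r}} \sum_{\mathbf{u} \in S} e^{-\|\mathbf{y} - \sqrt{P_T}\,\mathbf{U}\mathbf{\Lambda}\mathbf{P}\mathbf{G}\mathbf{u}\|^2} \qquad (12)$$

and when $n = n_r$ (i.e., $n_r \le n_t$), it is equivalently given by

$$p(\mathbf{y} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) = \frac{1}{|S|\,\pi^{n}} \sum_{\mathbf{u} \in S} e^{-\|\mathbf{r} - \sqrt{P_T}\,\mathbf{\Lambda}\mathbf{P}\mathbf{G}\mathbf{u}\|^2} \qquad (13)$$

where $\mathbf{r} = (r_1, r_2, \ldots, r_n)^T = \mathbf{U}^\dagger\mathbf{y}$.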

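We next define the capacity of the MIMO Gaussian channel when precoding with $\mathbf{G}$. In the following, we
assume that $n_r \le n_t$, so that $I_S(\mathbf{u}; \mathbf{y} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) = I_S(\mathbf{u}; \mathbf{r} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G})$. Note that, when $n_r > n_t$, the receiver
processing $\mathbf{r} = \mathbf{U}^\dagger\mathbf{y}$ becomes information lossy, and $I_S(\mathbf{u}; \mathbf{y} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) \ge I_S(\mathbf{u}; \mathbf{r} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G})$.

We introduce the following definitions. For a given pairing $\ell$, let $\mathbf{r}_k = (r_{i_k}, r_{j_k})^T$, $\mathbf{u}_k = (u_{i_k}, u_{j_k})^T$, $\mathbf{\Lambda}_k =
\mathrm{diag}(\lambda_{i_k}, \lambda_{j_k})$, $\mathbf{P}_k = \mathrm{diag}(p_{i_k}, p_{j_k})$ and $S_k = \mathcal{U}_{i_k} \times \mathcal{U}_{j_k}$. Due to the pairing structure of $\mathbf{G}$, the mutual information
$I_S(\mathbf{u}; \mathbf{r} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G})$ can be expressed as the sum of the mutual information of all the $n/2$ pairs as follows:

$$I_S(\mathbf{u}; \mathbf{r} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) = \sum_{k=1}^{n/2} I_{S_k}(\mathbf{u}_k; \mathbf{r}_k | \mathbf{\Lambda}_k, \mathbf{P}_k, \theta_k) \qquad (14)$$

Having fixed the precoder structure to $\mathbf{T} = \mathbf{V}^\dagger\mathbf{P}\mathbf{G}$, we can formulate the following

Problem 3:

$$C_X(\mathbf{H}, P_T) = \max_{\mathbf{G}, \mathbf{P} \,|\, \mathbf{u} \in S,\ \mathrm{tr}(\mathbf{P}\mathbf{P}^\dagger) = 1} I_S(\mathbf{u}; \mathbf{r} | \mathbf{\Lambda}, \mathbf{P}, \mathbf{G}) \qquad (15)$$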

It is clear that the solution of the above problem is still a formidable task, although it is simpler than Problem 2.
In fact, instead of the $n \times n$ variables of $\mathbf{T}$, we now deal with $n$ variables for power allocation in $\mathbf{P}$, $n/2$ variables
for the angles defining $\mathbf{A}_k$, and the pairing $\ell \in \mathcal{L}$. In the following, we will show how to efficiently solve Problem
3 by splitting it into two simpler problems.

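Power allocation can be divided into power allocation among the $n/2$ pairs, followed by power allocation between
the two subchannels of each pair. Let $\bar{\mathbf{P}} = \mathrm{diag}(\bar{p}_1, \bar{p}_2, \ldots, \bar{p}_{n/2})$ be a diagonal matrix, where $\bar{p}_k = \sqrt{p_{i_k}^2 + p_{j_k}^2}$,
with $\bar{p}_k^2$ being the power allocated to the $k$-th pair. The power allocation within each pair can be simply expressed in
terms of the fraction $f_k = p_{i_k}^2 / \bar{p}_k^2$ of the power assigned to the first subchannel of the pair. The mutual information
achieved by the $k$-th pair is then given by

$$\begin{aligned} I_{S_k}(\mathbf{u}_k; \mathbf{r}_k | \mathbf{\Lambda}_k, \mathbf{P}_k, \theta_k) &= I_{S_k}(\mathbf{u}_k; \mathbf{r}_k | \mathbf{\Lambda}_k, \bar{p}_k, f_k, \theta_k) \\ &= -\int p(\mathbf{r}_k) \log_2 p(\mathbf{r}_k)\, d\mathbf{r}_k - 2\log_2(\pi e) \end{aligned} \qquad (16)$$

where $p(\mathbf{r}_k)$ is given by

$$p(\mathbf{r}_k) = \frac{1}{|S_k|\,\pi^2} \sum_{\mathbf{u}_k \in S_k} e^{-\|\mathbf{r}_k - \sqrt{P_T}\,\bar{p}_k \mathbf{\Lambda}_k \mathbf{F}_k \mathbf{A}_k \mathbf{u}_k\|^2} \qquad (17)$$

where $\mathbf{F}_k = \mathrm{diag}(\sqrt{f_k}, \sqrt{1 - f_k})$ and $\mathbf{A}_k$ is given by (9).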
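The integral in (16) has no closed form, so in practice it is evaluated numerically. The following is a minimal Monte Carlo sketch, assuming 4-QAM on both subchannels of the pair and unit-variance circularly symmetric complex Gaussian noise. It uses the equivalent form $\log_2|S_k| - E\left[\log_2 \sum_{\mathbf{u}'} e^{\|\mathbf{z}\|^2 - \|\mathbf{r}_k - \mathbf{m}(\mathbf{u}')\|^2}\right]$, obtained by substituting (17) into (16), where $\mathbf{m}(\mathbf{u}')$ is the noise-free received point. All function and parameter names are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def qam4():
    """Unit-energy 4-QAM points (an assumed alphabet for this sketch)."""
    a = 1.0 / math.sqrt(2.0)
    return [complex(re, im) for re in (-a, a) for im in (-a, a)]


def pair_mutual_information(lams, p_total, p_bar, f, theta,
                            samples=2000, seed=0):
    """Monte Carlo estimate (in bits) of the mutual information of one pair."""
    rng = random.Random(seed)
    c, s = math.cos(theta), math.sin(theta)
    scale = math.sqrt(p_total) * p_bar
    g1 = scale * lams[0] * math.sqrt(f)        # gain on first subchannel
    g2 = scale * lams[1] * math.sqrt(1.0 - f)  # gain on second subchannel
    # Noise-free received points sqrt(P_T) * p_bar * Lambda_k F_k A_k u_k.
    points = [(g1 * (c * u1 + s * u2), g2 * (-s * u1 + c * u2))
              for u1 in qam4() for u2 in qam4()]
    sigma = math.sqrt(0.5)  # per real dimension, so that E[|z_i|^2] = 1
    acc = 0.0
    for _ in range(samples):
        m1, m2 = rng.choice(points)
        z1 = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        z2 = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        r1, r2 = m1 + z1, m2 + z2
        nz = abs(z1) ** 2 + abs(z2) ** 2
        total = 0.0
        for q1, q2 in points:
            d2 = abs(r1 - q1) ** 2 + abs(r2 - q2) ** 2
            total += math.exp(nz - d2)
        acc += math.log2(total)
    return math.log2(len(points)) - acc / samples
```

With zero transmit power the estimate is 0 bits, and at very high power with a rotation that keeps all received points distinct it saturates at $\log_2|S_k| = 4$ bits for a pair of 4-QAM symbols. A numerical search over $\theta_k$ and $f_k$ could call such a routine repeatedly.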

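The capacity of the discrete input MIMO Gaussian channel when precoding with X-Codes can be expressed as

Problem 4:

$$C_X(\mathbf{H}, P_T) = \max_{\ell \in \mathcal{L},\ \bar{\mathbf{P}} \,|\, \mathrm{tr}(\bar{\mathbf{P}}\bar{\mathbf{P}}^\dagger) = 1} \sum_{k=1}^{n/2} C_{S_k}(k, \ell, \bar{p}_k) \qquad (18)$$

where $C_{S_k}(k, \ell, \bar{p}_k)$, the capacity of the $k$-th pair in the pairing $\ell$, is achieved by solving

Problem 5:

$$C_{S_k}(k, \ell, \bar{p}_k) = \max_{\theta_k, f_k} I_{S_k}(\mathbf{u}_k; \mathbf{r}_k | \mathbf{\Lambda}_k, \bar{p}_k, f_k, \theta_k) \qquad (19)$$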

In other words, we have split Problem 3 into two simpler problems. Firstly, given a pairing $\ell$ and power
allocation between pairs $\bar{\mathbf{P}}$, we can solve Problem 5 for each $k = 1, 2, \ldots, n/2$. Problem 4 uses the solution to
Problem 5 to find the optimal pairing $\ell^*$ and the optimal power allocation $\bar{\mathbf{P}}^*$ between the $n/2$ pairs. For small $n$,
the optimal pairing and power allocation between pairs can always be computed numerically by brute force
enumeration of all possible pairings. This is, however, prohibitively complex for large $n$, and we shall discuss
heuristic approaches in Section VII.

We will show in the following that, although suboptimal, precoding with X-Codes provides close to optimal
capacity with the additional benefit that the detection complexity at the receiver is greatly reduced, since there is
coupling only between pairs of channels, as compared to the case of full coupling for the optimal precoder in [4].

In the next section, we solve Problem 5, which is equivalent to finding the optimal rotation angle and power
allocation for a Gaussian MIMO channel with only $n = 2$ subchannels.

V. Gaussian MIMO channels with n = 2 

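With $n = 2$, there is only one pair and only one possible pairing. Therefore, we drop the subscript $k$ in Problem
5 and we find $C_X(\mathbf{H}, P_T)$ in Problem 3. The processed received vector $\mathbf{r} \in \mathbb{C}^2$ is given by

$$\mathbf{r} = \sqrt{P_T}\,\mathbf{\Lambda}\mathbf{F}\mathbf{A}\mathbf{u} + \mathbf{z} \qquad (20)$$

where $\mathbf{z} = \mathbf{U}^\dagger\mathbf{w}$ is the equivalent noise vector with the same statistics as $\mathbf{w}$. Let $\alpha = \lambda_1^2 + \lambda_2^2$ be the overall
channel power gain and $\beta = \lambda_1/\lambda_2$ be the condition number of the channel. Then (20) can be rewritten as

$$\mathbf{r} = \sqrt{\tilde{P}_T}\,\tilde{\mathbf{\Lambda}}\mathbf{F}\mathbf{A}\mathbf{u} + \mathbf{z} \qquad (21)$$

where $\tilde{P}_T = P_T\alpha$ and $\tilde{\mathbf{\Lambda}} = \mathbf{\Lambda}/\sqrt{\alpha} = \mathrm{diag}(\beta/\sqrt{1+\beta^2},\ 1/\sqrt{1+\beta^2})$. The equivalent channel $\tilde{\mathbf{\Lambda}}$ now has a gain
of 1, and its channel gains depend only upon $\beta$. Our goal is, therefore, to find the optimal rotation angle
$\theta^*$ and the fractional power allocation $f^*$, which maximize the mutual information of the equivalent channel with
condition number $\beta$ and gain $\alpha = 1$. The total available transmit power is now $\tilde{P}_T$.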
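The normalization above is a simple change of variables. The following is a minimal sketch of it, assuming $\lambda_1 \ge \lambda_2 > 0$; the function name `normalize_pair` is an illustrative assumption.

```python
import math


def normalize_pair(lam1, lam2, p_total):
    """Map a pair (lam1, lam2) with transmit power p_total to the
    equivalent unit-gain channel.

    Returns (alpha, beta, p_tilde, (g1, g2)) with g1^2 + g2^2 = 1."""
    alpha = lam1 ** 2 + lam2 ** 2        # overall channel power gain
    beta = lam1 / lam2                   # condition number
    p_tilde = p_total * alpha            # equivalent transmit power
    denom = math.sqrt(1.0 + beta ** 2)
    return alpha, beta, p_tilde, (beta / denom, 1.0 / denom)
```

For example, $\lambda_1 = 0.8$, $\lambda_2 = 0.4$ and $P_T = 10$ give $\alpha = 0.8$, $\beta = 2$ and $\tilde{P}_T = 8$, with normalized gains $(2/\sqrt{5}, 1/\sqrt{5})$, which equal $\lambda_1/\sqrt{\alpha}$ and $\lambda_2/\sqrt{\alpha}$.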

It is difficult to obtain analytic expressions for the optimal $\theta^*$ and $f^*$. We therefore use numerical techniques
to evaluate them and store them in lookup tables to be used at run time. For a given application scenario, given the
distribution of $\beta$, we decide upon a few discrete values of $\beta$ which are representative of the actual values observed
in real channels. For each such quantized value of $\beta$, we numerically compute a table of the optimal values $f^*$ and
$\theta^*$ as a function of $\tilde{P}_T$. These tables are constructed offline. During communication, the transmitter
knows the values of $\alpha$ and $\beta$ from channel measurements. It then finds the lookup table with the closest value of $\beta$
to the measured one. The optimal values $f^*$ and $\theta^*$ are then found by indexing the appropriate entry in the table
with $\tilde{P}_T$ equal to $P_T\alpha$.
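The run-time lookup described above can be sketched as a nearest-neighbour search over a quantized $\beta$ grid followed by a search over a power grid. This is a minimal sketch under stated assumptions: the function names, the power grid in dB, and the table layout are all hypothetical, and the table entries are left as empty placeholders rather than actual optimized values.

```python
import bisect


def nearest(sorted_values, x):
    """Index of the entry of `sorted_values` (ascending) closest to x."""
    pos = bisect.bisect_left(sorted_values, x)
    candidates = [k for k in (pos - 1, pos) if 0 <= k < len(sorted_values)]
    return min(candidates, key=lambda k: abs(sorted_values[k] - x))


def lookup(tables, betas, powers_db, beta, p_tilde_db):
    """Return the stored entry for the closest beta and closest power."""
    table = tables[betas[nearest(betas, beta)]]
    return table[nearest(powers_db, p_tilde_db)]


# Hypothetical grids; "f" and "theta" would be filled in offline by
# numerically solving the n = 2 problem for each (beta, power) point.
betas = [1.0, 1.5, 2.0, 4.0, 8.0]
powers_db = [0.0, 5.0, 10.0, 15.0, 20.0]
tables = {b: [{"beta": b, "p_db": p, "f": None, "theta": None}
              for p in powers_db] for b in betas}

entry = lookup(tables, betas, powers_db, beta=1.7, p_tilde_db=12.0)
```

Since the tables are built offline, the run-time cost per pair is two binary searches, which is what makes recomputing the precoder for every channel realization feasible.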

In Fig. 1 we plot the optimal power fraction $f^*$ to be allocated to the stronger channel in the pair, as
a function of $P_T$. The input alphabet is 16-QAM and $\beta = 1, 1.5, 2, 4, 8$. For $\beta = 1$, both channels have equal gains,
and therefore, as expected, the optimal power allocation is to divide power equally between the two subchannels.
However, with increasing $\beta$, the power allocation becomes more asymmetrical. It is observed that at low $P_T$ it is
optimal to allocate all power to the stronger channel. At high $P_T$ the opposite is true, and it is the weaker channel
which gets most of the power. For a fixed $\beta$, as $P_T$ increases, the power allocated to the stronger channel is shifted
to the weaker channel. For a fixed $P_T$, a higher fraction of the total power is allocated to the weaker channel with
increasing $\beta$. In the high $P_T$ regime, these results are in contrast with the waterfilling scheme, where almost all
subchannels are allocated equal power.

In Fig. 2 the optimal rotation angle $\theta^*$ is plotted as a function of $P_T$. The input alphabet is 16-QAM and
$\beta = 1.5, 2, 4, 8$. For $\beta = 1$ the mutual information is independent of $\theta$ for all values of $P_T$. For $\beta = 1.5, 2$,
the optimal rotation angle is almost invariant to $P_T$. For larger $\beta$, the optimal rotation angle varies with $P_T$ and
approximately ranges between 30° and 40° for all $P_T$ values of interest.

Fig. 1. Plot of $f^*$ versus $P_T$ for $n = 2$ parallel channels with $\beta = 1, 1.5, 2, 4, 8$ and $\alpha = 1$. Input alphabet is 16-QAM.



Fig. 3 shows the variation of the mutual information with the power fraction $f$ for $\alpha = 1$. The power $P_T$ is
fixed at 17 dB and the input alphabet is 16-QAM. We observe that for all values of $\beta$, the mutual information is
a concave function of $f$. We also observe that the sensitivity of the mutual information to variation in $f$ increases
with increasing $\beta$. However, for all $\beta$, the mutual information is fairly stable (has a "plateau") around the optimal
power fraction. This is good for practical implementation, since it implies that an error in choosing the correct
power allocation would result in a very small loss in the achieved mutual information.

In Fig. 4 we plot the variation of the mutual information w.r.t. the rotation angle $\theta$. The power $P_T$ is fixed
at 17 dB and the input alphabet is 16-QAM. For $\beta = 1$, the mutual information is constant with $\theta$.
With increasing $\beta$, mutual information is observed to be increasingly sensitive to $\theta$. However, when compared with
Fig. 3, it can also be seen that the mutual information appears to be more sensitive to the power allocation fraction
$f$ than to $\theta$.

In Fig. 5 we plot the mutual information of X-Codes for different rotation angles with $\alpha = 1$ and $\beta = 2$. For
each rotation angle, the power allocation is optimized numerically. We observe that the mutual information is quite
sensitive to the rotation angle except in the range 30° to 40°.

We next present some simulation results to show that our simple precoding scheme can indeed significantly
increase the mutual information, compared to the case of no precoding across subchannels (i.e., Mercury/waterfilling).
For the sake of comparison, we also present the mutual information achieved by the waterfilling scheme with discrete
input alphabets.

Fig. 2. Plot of $\theta^*$ versus $P_T$ for $n = 2$ parallel channels with $\beta = 1.5, 2, 4, 8$ and $\alpha = 1$. Input alphabet is 16-QAM.

We restrict the discrete input alphabets $\mathcal{U}_i$, $i = 1, 2$, to be square $M$-QAM alphabets consisting of two $\sqrt{M}$-PAM
alphabets in quadrature. Mutual information is evaluated by solving Problem 5 (i.e., numerically maximizing w.r.t.
the rotation angle and power allocation).
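For reference, a square $M$-QAM alphabet of this kind can be generated as the Cartesian product of two $\sqrt{M}$-PAM alphabets. The sketch below normalizes the points to unit average energy, matching the constraint $E[|u_i|^2] = 1$ from Section II; the function name `square_qam` is an illustrative assumption.

```python
import math


def square_qam(m):
    """Return the m points of a unit-average-energy square QAM alphabet."""
    side = math.isqrt(m)
    if side * side != m:
        raise ValueError("m must be a perfect square")
    # sqrt(M)-PAM levels: -(side-1), ..., -1, +1, ..., +(side-1).
    levels = [2 * k - (side - 1) for k in range(side)]
    # Average energy of the unnormalized QAM points is 2(M - 1)/3.
    scale = math.sqrt(2.0 * (m - 1) / 3.0)
    return [complex(a, b) / scale for a in levels for b in levels]
```

For $M = 16$ the unnormalized levels are $\{-3, -1, 1, 3\}$ on each axis and the scale factor is $\sqrt{10}$.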

In Fig. 6 we plot the maximal mutual information versus $P_T$, for a system with two subchannels, $\beta = 2$ and
$\alpha = 1$. Mutual information is plotted for 4- and 16-QAM signal sets. It is observed that for a given achievable mutual
information, coding across subchannels is more power efficient. For example, with 4-QAM and an achievable mutual
information of 3 bits, X-Codes require only 0.8 dB more transmit power when compared to the ideal Gaussian
signalling with waterfilling. This gap increases to 1.9 dB for Mercury/waterfilling and 2.8 dB for the waterfilling
scheme with 4-QAM as the input alphabet. A similar trend is observed with 16-QAM as the input alphabet. The
proposed precoder clearly performs better, since the mutual information is optimized w.r.t. the rotation angle $\theta$ and
power allocation, while Mercury/waterfilling, as a special case of X-Codes, only optimizes power allocation and
fixes $\theta = 0$.

In Fig. 7 we compare the mutual information achieved by X-Codes and the Mercury/waterfilling strategy for
$\alpha = 1$ and $\beta = 1, 2, 4$. The input alphabet is 4-QAM. It is observed that both schemes have the same
mutual information when $\beta = 1$. However, with increasing $\beta$, the mutual information of the Mercury/waterfilling
strategy is observed to degrade significantly at high $P_T$, whereas the performance of X-Codes does not vary as
much. The degradation of mutual information for the Mercury/waterfilling strategy is explained as follows. For the
Mercury/waterfilling strategy, with increasing $\beta$, all the available power is allocated to the stronger channel until a
certain transmit power threshold. However, since finite signal sets are used, mutual information is bounded from
above until the transmit power exceeds this threshold. This also explains the intermediate change
of slope in the mutual information curve with $\beta = 4$ (see the rightmost curve in Fig. 7). On the other hand, due
to coding across subchannels, this problem does not arise when precoding with X-Codes. Therefore, in terms of
achievable mutual information, rotation coding is observed to be more robust to ill-conditioned channels.

Fig. 3. Mutual information of X-Codes versus power allocation fraction $f$ for $n = 2$ parallel channels with $\beta = 1, 1.5, 2, 4, 8$, $\alpha = 1$ and
$P_T = 17$ dB. Input alphabet is 16-QAM.

For low values of $P_T$, the mutual information of both schemes is similar, and improves with increasing $\beta$. This
is due to the fact that, at low $P_T$, mutual information increases linearly with $P_T$, and therefore all power is assigned
to the stronger channel. With increasing $\beta$, the stronger channel has an increasing fraction of the total channel gain,
which results in increased mutual information.

In Fig. 8 the mutual information with X-Codes is plotted for $\beta = 1, 2, 4, 8$ and with 16-QAM as the input
alphabet. It is observed that at low values of $P_T$, a higher value of $\beta$ is favorable. However, at high $P_T$, with
16-QAM input alphabets, the performance degrades with increasing $\beta$. This degradation is more significant compared
to the degradation observed with 4-QAM input alphabets. Therefore it can be concluded that the mutual information
is more sensitive to $\beta$ with 16-QAM input alphabets as compared to 4-QAM.

VI. Gaussian MIMO channels with n > 2 

We now consider the problem of finding the optimal pairing and power allocation between pairs for different
Gaussian MIMO channels with even $n$ and $n > 2$. We first observe that mutual information is indeed sensitive to
the chosen pairing, which justifies the effort of computing the optimal pairing. This is illustrated
through Fig. 9 for $n = 4$ with a diagonal channel $\mathbf{\Lambda} = \mathrm{diag}(0.8, 0.4, 0.4, 0.2)$ and 16-QAM. Optimal power
allocation between the two pairs is computed numerically. It is observed that the pairing $\{(1,4), (2,3)\}$ performs
significantly better than the pairing $\{(1,3), (2,4)\}$.

February 23, 2010 DRAFT

Fig. 4. Mutual information of X-Codes versus rotation angle $\theta$ (in degrees) for $n = 2$ parallel channels with $\beta = 1, 1.5, 2, 4, 8$, $\alpha = 1$ and $P_T = 17$ dB.
Input alphabet is 16-QAM.

In Fig. 10 we compare the mutual information achieved with optimal precoding [4] to that achieved by the
proposed precoder with 4-QAM input alphabet. The $4 \times 4$ full channel matrix (non-diagonal channel) is given
by (42) in [4]. For X-Codes, the optimal pairing is $\{(1,4), (2,3)\}$ and the optimal power allocation between the
pairs is computed numerically. It is observed that X-Codes perform very close to the optimal precoding scheme.
Specifically, for an achievable mutual information of 6 bits, compared to the optimal precoder [4], X-Codes need
only 0.4 dB extra power whereas 2.3 dB extra power is required with Mercury/waterfilling.

Another application is in wireless MIMO channels with perfect channel state information at both the transmitter
and receiver. The channel coefficients are modeled as i.i.d. complex normal random variables with unit variance.

In Fig. 11 we plot the ergodic capacity (i.e., the mutual information averaged over channel realizations) for a $4 \times 4$
wireless MIMO channel. For X-Codes, the best pairing and power allocation between pairs are chosen numerically
using the optimal $\theta$ and power fraction tables created offline. It is observed that at high $P_T$, simple rotation-based
coding using X-Codes improves the mutual information significantly when compared to Mercury/waterfilling. For
example, for a target mutual information of 12 bits, X-Codes perform 1.2 dB away from the idealistic Gaussian
signalling scheme. This gap from the Gaussian signalling scheme increases to 3.1 dB for the Mercury/waterfilling
scheme and to 4.4 dB for the waterfilling scheme with 16-QAM alphabets.

In this application scenario the low complexity of our precoding scheme becomes an essential feature, since the
precoder can be computed on the fly using the lookup tables for each channel realization.

VII. Application to OFDM 

In OFDM applications, $n$ is large and Problem 4 becomes too complex to solve, since we can no longer find the
optimal pairing by enumeration.

It was observed in Section V that for $n = 2$, a larger value of the condition number $\beta$ leads to a higher mutual
information at low values of $P_T$ (low SNR). Therefore, we conjecture that pairing the $k$-th subchannel with the
$(n/2 + k)$-th subchannel could have mutual information very close to optimal, since this pairing scheme attempts
to maximize the minimum $\beta$ among all pairs. We shall call this scheme the "conjectured" pairing scheme, and the
X-Code scheme, which pairs the $k$-th with the $(n - k + 1)$-th subchannel, the "X-pairing" scheme. Note that the
"X-pairing" scheme was proposed in [1] as a scheme which achieved the optimal diversity gain when precoding
with X-Codes.
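The two heuristic pairings are easy to generate explicitly. The sketch below builds both, using 1-based subchannel indices sorted by decreasing gain as in the text, and computes the smallest condition number over the pairs; the function names are illustrative assumptions.

```python
def conjectured_pairing(n):
    """Pair the k-th subchannel with the (n/2 + k)-th one."""
    return [(k, n // 2 + k) for k in range(1, n // 2 + 1)]


def x_pairing(n):
    """Pair the k-th subchannel with the (n - k + 1)-th one."""
    return [(k, n - k + 1) for k in range(1, n // 2 + 1)]


def min_condition_number(gains, pairing):
    """Smallest beta = lambda_i / lambda_j over all pairs (i, j)."""
    return min(gains[i - 1] / gains[j - 1] for i, j in pairing)


gains = [0.8, 0.4, 0.4, 0.2]  # sorted in decreasing order
```

For these gains the "conjectured" pairing $\{(1,3), (2,4)\}$ gives $\beta = 2$ for both pairs, while the "X-pairing" $\{(1,4), (2,3)\}$ gives $\beta = 4$ and $\beta = 1$, so its minimum $\beta$ is smaller. This only illustrates the max-min property of the conjectured pairing and says nothing about which pairing achieves the higher mutual information.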

Given a pairing of subchannels, it is also difficult to compute the optimal power allocation between pairs $\bar{\mathbf{P}}$.
However, it was observed that for channels with large $n$, even waterfilling power allocation between the pairs (with
$\alpha_k = \lambda_{i_k}^2 + \lambda_{j_k}^2$ as the channel gain of the $k$-th pair) results in good performance.

Fig. 6. Mutual information versus $P_T$ for $n = 2$ parallel channels with $\beta = 2$ and $\alpha = 1$, for 4-QAM and 16-QAM.



Apart from the "conjectured" and the "X-pairing" schemes, we propose the following scheme, which is based on
the "Hungarian" assignment algorithm [7] and which attempts to find a good approximation to the optimal pairing.
We shall call this the "Hungarian" pairing scheme. Before describing the "Hungarian" pairing scheme, we briefly
review the Hungarian assignment problem as follows.

Consider $m$ different workers and $m$ different jobs that have to be completed. Also let $C(i,j)$ be the cost involved
when the $i$-th worker is assigned to the $j$-th job. We can therefore think of a cost matrix, whose $(i,j)$-th entry
has the value $C(i,j)$. The Hungarian assignment problem is then to find the optimal assignment of workers to
jobs (each worker getting assigned to exactly one job) such that the total cost of getting all the jobs completed
is minimized. It is easy to see that a maximization job assignment problem can be posed as a minimization
problem and vice versa.
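To make the assignment problem concrete, the following sketch solves it by brute force over all permutations. This is not the Hungarian algorithm itself, which reaches the same optimum in polynomial time; it is only practical for small $m$ and is meant to show the objective and how a maximization is turned into a minimization by negating the costs. The function name and the cost values are illustrative assumptions.

```python
from itertools import permutations


def best_assignment(cost, maximize=False):
    """Brute-force optimal assignment for an m x m cost matrix.

    Returns (assignment, total), where assignment[i] is the job index
    given to worker i."""
    m = len(cost)
    sign = -1.0 if maximize else 1.0  # maximizing = minimizing negated cost
    best = min(permutations(range(m)),
               key=lambda perm: sum(sign * cost[i][perm[i]] for i in range(m)))
    total = sum(cost[i][best[i]] for i in range(m))
    return list(best), total


cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
print(best_assignment(cost))                 # ([1, 0, 2], 5)
print(best_assignment(cost, maximize=True))  # ([0, 2, 1], 11)
```

In the pairing application, the entries of the matrix are mutual information values, so the maximizing form is the relevant one.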

To find a good approximation to the optimal pairing, we split the $n$ subchannels into two groups: i) Group-I:
subchannels 1 to $n/2$, with the $j$-th subchannel in the role of the $j$-th job ($j = 1, 2, \ldots, n/2$), and ii) Group-II:
subchannels $n/2 + 1$ to $n$, with the $(n/2 + i)$-th subchannel in the role of the $i$-th worker ($i = 1, 2, \ldots, n/2$).
Therefore, there are $n/2$ workers and jobs.

For a given SNR $P_T$, we initially assume uniform power allocation between all pairs and therefore assign a
power of $2P_T/n$ to each pair. The value of $C(i,j)$ is evaluated by finding the optimal mutual information achieved
by an equivalent $n = 2$ channel with the $(n/2 + i)$-th and the $j$-th subchannels as its two subchannels. This can be
obtained by first choosing a table (see Section V) with the closest value of $\beta$ to the given $\lambda_j/\lambda_{n/2+i}$, and then
indexing the appropriate entry in the table with SNR $= 2P_T(\lambda_j^2 + \lambda_{n/2+i}^2)/n$. The Hungarian algorithm then finds
the pairing with the highest mutual information. Power allocation between the pairs is then achieved through the
waterfilling scheme.

Fig. 7. Mutual information versus $P_T$ for $n = 2$ parallel channels with varying $\beta = 1, 2, 4$, $\alpha = 1$ and 4-QAM input alphabet.
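The quantities used to index the lookup tables for each candidate pair can be computed as in the following minimal sketch. The function names are illustrative assumptions, and `pair_capacity` stands for a hypothetical user-supplied routine (for instance, a table lookup) that returns the optimized mutual information of an $n = 2$ channel for a given $\beta$ and SNR.

```python
def pair_table_inputs(gains, i, j, p_total):
    """Condition number and equivalent SNR for pairing job j (Group-I)
    with worker i (Group-II), using 1-based indices and gains sorted in
    decreasing order."""
    n = len(gains)
    strong = gains[j - 1]           # j-th subchannel
    weak = gains[n // 2 + i - 1]    # (n/2 + i)-th subchannel
    beta = strong / weak
    snr = 2.0 * p_total * (strong ** 2 + weak ** 2) / n
    return beta, snr


def cost_matrix(gains, p_total, pair_capacity):
    """Matrix C(i, j) of per-pair mutual information values."""
    half = len(gains) // 2
    return [[pair_capacity(*pair_table_inputs(gains, i, j, p_total))
             for j in range(1, half + 1)]
            for i in range(1, half + 1)]
```

For example, with gains $(0.8, 0.4, 0.4, 0.2)$, $i = j = 1$ and $P_T = 10$, the pair consists of subchannels 1 and 3, giving $\beta = 2$ and SNR $= 2 \cdot 10 \cdot (0.64 + 0.16)/4 = 4$.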

It was observed through Monte Carlo simulations that even uniform power allocation between the subchannels 
results in almost the same mutual information as that achieved through waterfilling between pairs. This can be explained 
by the fact that, once the subchannels are separated into a group of stronger channels (Group-I) and a group of weaker channels (Group-II), 
any pairing results in all pairs having almost the same channel gain $\alpha_k$. This implies that the 
optimal power allocation scheme would allocate nearly equal power to all pairs, which both the uniform and 
the waterfilling schemes also do. Hence, it can be conjectured that, with the proposed separation of 
subchannels into two groups, both the uniform and the waterfilling power allocation schemes perform close to 
optimally, and that any further improvement in mutual information from optimizing the power allocation would 
be minimal. This also supports the initial use of uniform power $2P_T/n$ to compute the entries C(i,j) before 
executing the Hungarian algorithm. Furthermore, the computational complexity of the Hungarian algorithm is $O(n^3)$, 
which makes it practically feasible. 
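
For readers who want to experiment with the pairing step, the sketch below gives a compact pure-Python version of the Hungarian algorithm in its standard potentials-based $O(n^3)$ form, applied to a maximization problem by negating the matrix. It is an illustrative implementation rather than the authors' code, and the 3 x 3 matrix of C(i,j) values is made up for the example.

```python
def hungarian_min(cost):
    """Solve a square assignment problem; return job_of[i] minimizing the total cost."""
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)    # potentials of workers (rows), 1-indexed
    v = [0.0] * (n + 1)    # potentials of jobs (columns), 1-indexed
    match = [0] * (n + 1)  # match[j] = row assigned to column j (0 means none)
    way = [0] * (n + 1)    # way[j] = previous column on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift the potentials so that at least one new column becomes reachable.
            for j in range(n + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        # Flip the assignments along the augmenting path back to the new row.
        while j0:
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    job_of = [0] * n
    for j in range(1, n + 1):
        job_of[match[j] - 1] = j - 1
    return job_of

def hungarian_max(value):
    """Maximize the total value by minimizing the negated matrix."""
    return hungarian_min([[-x for x in row] for row in value])

# Hypothetical C(i,j) values: row i is a worker (Group-II), column j is a job (Group-I).
C = [
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
]
pairing = hungarian_max(C)
total = sum(C[i][pairing[i]] for i in range(len(C)))
```

Here `pairing[i]` is the job (Group-I subchannel) paired with worker i (Group-II subchannel), and `total` is the summed mutual information of the selected pairs.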

Fig. 8. Mutual information with X-Codes versus $P_T$ for n = 2 parallel channels with varying $\beta$ = 1, 2, 4, 8, $\alpha$ = 1 and 16-QAM input alphabet. 

To study the sensitivity of the mutual information to the pairing of subchannels, we also consider a "Random" 
pairing scheme. In the "Random" pairing scheme, we first choose a large number ($\approx$ 50) of random pairings. For 
each chosen random pairing, we evaluate the mutual information (through Monte Carlo simulations) with waterfilling 
power allocation between pairs. Finally, the average mutual information is computed. This gives us insight into the 
mean value of the mutual information with respect to the pairing. It also helps us quantify the effectiveness of the 
heuristic pairing schemes discussed above. 
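
The averaging procedure of the "Random" scheme can be sketched as below, assuming that a pairing is drawn uniformly from all ways of splitting the n subchannels into n/2 pairs (the text does not specify the distribution). The `evaluate` callback is a hypothetical stand-in for the Monte Carlo evaluation of the mutual information with waterfilling between pairs, which is not implemented here.

```python
import random

def random_pairing(n, rng):
    """Return a random pairing of subchannels 0..n-1 as a list of n/2 index pairs."""
    order = list(range(n))
    rng.shuffle(order)
    # Consecutive entries of a uniformly shuffled list form a uniformly random pairing.
    return [(order[k], order[k + 1]) for k in range(0, n, 2)]

def average_over_random_pairings(n, evaluate, trials=50, seed=0):
    """Average evaluate(pairing) over `trials` independently drawn random pairings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += evaluate(random_pairing(n, rng))
    return total / trials

example_pairing = random_pairing(8, random.Random(1))

# Toy stand-in for the mutual information evaluation: it only counts the pairs.
average = average_over_random_pairings(8, evaluate=lambda pairing: float(len(pairing)))
```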

We next illustrate the mutual information achieved by these heuristic schemes for an OFDM system with 
n = 32 subchannels and 16-QAM. The channel impulse response is [-0.454 + j0.145, -0.258 + j0.198, 0.0783 + 
j0.069, -0.408 - j0.396, -0.532 - j0.224]. For the "conjectured" and the "X-pairing" schemes as well, power allocation 
is achieved through waterfilling between the pairs. 
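
To make the setup concrete, the sketch below computes the 32 subchannel gains of this OFDM system as the magnitudes of the 32-point DFT of the quoted impulse response, and sorts them in decreasing order so that the first n/2 entries form the stronger group. It assumes the usual cyclic-prefix OFDM model, in which the subcarriers become parallel channels with these gains; the function name `subchannel_gains` is illustrative.

```python
import cmath

# Channel impulse response taps quoted in the text.
TAPS = [
    complex(-0.454, 0.145),
    complex(-0.258, 0.198),
    complex(0.0783, 0.069),
    complex(-0.408, -0.396),
    complex(-0.532, -0.224),
]

def subchannel_gains(taps, n):
    """Return |H[k]| for k = 0..n-1, where H is the n-point DFT of the taps."""
    gains = []
    for k in range(n):
        h_k = sum(
            tap * cmath.exp(-2j * cmath.pi * k * m / n)
            for m, tap in enumerate(taps)
        )
        gains.append(abs(h_k))
    return gains

gains = subchannel_gains(TAPS, 32)

# Decreasing order: entries 0..15 play the role of Group-I, entries 16..31 of Group-II.
sorted_gains = sorted(gains, reverse=True)
```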

In Fig. 12, the total mutual information is plotted as a function of the SNR per subcarrier. It is observed that the 
proposed precoding scheme performs much better than the Mercury/waterfilling scheme. The proposed precoder 
with the "Hungarian" pairing scheme performs within 1.1 dB of the Gaussian signalling scheme for an achievable 
total mutual information of 96 bits (i.e., a rate of 96/128 = 3/4, since 32 subchannels with 16-QAM carry at most 
32 x 4 = 128 bits). The proposed precoder with the "Hungarian" pairing 
scheme performs about 1.6 dB better than the Mercury/waterfilling scheme. The "X-pairing" scheme performs better 
than the Mercury/waterfilling scheme and worse than the "Hungarian" pairing scheme. Even at a low rate of 1/2 (i.e., a 
total mutual information of 64 bits), the proposed precoder with the "Hungarian" pairing scheme performs about 
0.7 dB better than the Mercury/waterfilling scheme. 

Fig. 9. Mutual information versus $P_T$ with two different pairings for an n = 4 diagonal channel and 16-QAM input alphabet. 

In Fig. 13, we compare the mutual information achieved by the various heuristic pairing schemes. It is observed 
that the "conjectured" pairing scheme performs very close to the "Hungarian" pairing scheme except at very high 
SNR. For example, even for a high mutual information of 96 bits, the "Hungarian" pairing scheme performs better 
than the "conjectured" pairing scheme by only about 0.2 dB. However, at very high rates (7/8 and above), 
the "Hungarian" pairing scheme is observed to perform better than the "conjectured" pairing scheme by about 
0.7 dB. Therefore, for low to medium rates, it would be better to use the "conjectured" pairing, since it has nearly the 
same performance at a lower computational complexity. The mutual information achieved by the "Random" pairing 
scheme is observed to be strictly inferior to that of the "conjectured" pairing scheme at all values of SNR, and at low 
SNR it is even worse than that of the Mercury/waterfilling strategy. This implies that the total mutual information 
is indeed sensitive to the chosen pairing. Further, up to a rate of 1/2 (i.e., a mutual information of 64 bits), it appears 
that any extra optimization effort would not result in a significant performance improvement for the "conjectured" 
pairing scheme, since it is already very close to the ideal Gaussian signalling scheme. However, at higher rates 
and SNR, it may still be possible to improve the mutual information by further optimizing the selection of the pairing 
scheme and the power allocation between pairs. This is, however, a difficult problem that requires further investigation. 

VIII. Conclusions 

In this paper, we proposed a low-complexity precoding scheme based on the pairing of subchannels, which 
achieves near-optimal capacity for Gaussian MIMO channels with discrete inputs. The low-complexity feature 
relates to both the evaluation of the optimal precoder matrix and the detection at the receiver. This makes the 
proposed scheme suitable for practical applications, even when the channels are time varying and the precoder 
needs to be computed for each channel realization. 
Fig. 10. Mutual information versus $P_T$ for the Gigabit DSL channel given by (42) in (4). 



The simple precoder structure, inspired by the X-Codes, enabled us to split the precoder optimization problem 
into two simpler problems. Firstly, for a given pairing and power allocation between pairs, we need to find the 
optimal power fraction allocation and rotation angle for each pair. Given the solution to the first problem, the second 
problem is then to find the optimal pairing and the power allocation between pairs. 

For large n, typical of OFDM systems, we also discussed different heuristic approaches for optimizing the pairing 
of subchannels. 

The proposed precoder was shown to perform better than the Mercury/waterfilling strategy for both diagonal and 
non-diagonal MIMO channels. Future work will focus on finding close to optimal pairings, and close to optimal 
power allocation strategies between pairs. 
Fig. 12. Mutual information versus per subcarrier SNR for an OFDM system with 32 carriers. X-Codes versus Mercury/waterfilling. 
Fig. 13. Mutual information versus per subcarrier SNR for an OFDM system with 32 carriers. Comparison of heuristic pairing schemes. 